 electron density


Machine Learning Time Propagators for Time-Dependent Density Functional Theory Simulations

Shah, Karan, Cangi, Attila

arXiv.org Artificial Intelligence

Time-dependent density functional theory (TDDFT) is a widely used method to investigate electron dynamics under external time-dependent perturbations such as laser fields. In this work, we present a machine learning approach to accelerate electron dynamics simulations based on real-time TDDFT using autoregressive neural operators as time propagators for the electron density. By leveraging physics-informed constraints, featurization, and high-resolution training data, our model achieves superior accuracy and computational speed compared to traditional numerical solvers. We demonstrate the effectiveness of our model on a class of one-dimensional diatomic molecules under the influence of a range of laser parameters. This method has potential in enabling on-the-fly modeling of laser-irradiated molecules and materials by utilizing fast machine learning predictions in a large space of varying experimental parameters of the laser.
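The autoregressive propagation described in this abstract can be illustrated with a minimal sketch. It assumes a one-dimensional density on a uniform grid and a propagator that maps the current density and the current laser field value to the next density. The names `rollout`, `normalize`, and `toy_step` are hypothetical, `toy_step` is only a stand-in for a trained neural operator, and renormalizing to a fixed electron count is one plausible physics constraint, not necessarily the one the authors use.

```python
from typing import Callable, List

Density = List[float]


def normalize(rho: Density, dx: float, n_electrons: float) -> Density:
    """Clip negative values and rescale so that sum(rho) * dx == n_electrons."""
    clipped = [max(v, 0.0) for v in rho]
    total = sum(clipped) * dx
    if total == 0.0:
        raise ValueError("density vanished; cannot renormalize")
    return [v * n_electrons / total for v in clipped]


def rollout(
    step: Callable[[Density, float], Density],
    rho0: Density,
    field: List[float],
    dx: float,
    n_electrons: float,
) -> List[Density]:
    """Feed each predicted density back in as the input for the next time step."""
    trajectory = [normalize(rho0, dx, n_electrons)]
    for e_t in field:
        rho_next = step(trajectory[-1], e_t)
        # Assumed constraint: enforce non-negativity and particle number after each step.
        trajectory.append(normalize(rho_next, dx, n_electrons))
    return trajectory


def toy_step(rho: Density, e_t: float) -> Density:
    """Hypothetical stand-in for a learned propagator: smooth, then tilt with the field."""
    n = len(rho)
    out = []
    for i in range(n):
        left = rho[i - 1] if i > 0 else rho[i]
        right = rho[i + 1] if i < n - 1 else rho[i]
        smoothed = 0.25 * left + 0.5 * rho[i] + 0.25 * right
        out.append(smoothed * (1.0 + 0.1 * e_t * (i - n / 2) / n))
    return out


trajectory = rollout(toy_step, [0.0, 1.0, 2.0, 1.0, 0.0], [0.1, -0.1, 0.2], dx=0.5, n_electrons=2.0)
```

Because each output becomes the next input, errors can accumulate over long rollouts, which is one reason to apply a constraint at every step rather than only at the end.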


Beyond Atoms: Evaluating Electron Density Representation for 3D Molecular Learning

Suriana, Patricia, Rackers, Joshua A., Nowara, Ewa M., Pinheiro, Pedro O., Nicoloudis, John M., Sresht, Vishnu

arXiv.org Artificial Intelligence

Machine learning models for 3D molecular property prediction typically rely on atom-based representations, which may overlook subtle physical information. Electron density maps -- the direct output of X-ray crystallography and cryo-electron microscopy -- offer a continuous, physically grounded alternative. We compare three voxel-based input types for 3D convolutional neural networks (CNNs): atom types, raw electron density, and density gradient magnitude, across two molecular tasks -- protein-ligand binding affinity prediction (PDBbind) and quantum property prediction (QM9). We focus on voxel-based CNNs because electron density is inherently volumetric, and voxel grids provide the most natural representation for both experimental and computed densities. On PDBbind, all representations perform similarly with full data, but in low-data regimes, density-based inputs outperform atom types, while a shape-based baseline performs comparably -- suggesting that spatial occupancy dominates this task. On QM9, where labels are derived from Density Functional Theory (DFT) but input densities from a lower-level method (XTB), density-based inputs still outperform atom-based ones at scale, reflecting the rich structural and electronic information encoded in density. Overall, these results highlight the task- and regime-dependent strengths of density-derived inputs, improving data efficiency in affinity prediction and accuracy in quantum property modeling.
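The two density-derived input types compared in this abstract, raw density and density gradient magnitude, can be sketched on a small voxel grid. This is an illustration only: the density here is a sum of isotropic Gaussians, which is an assumption standing in for experimental or xTB-computed densities, and the function names `gaussian_density` and `gradient_magnitude` are hypothetical. The gradient uses central differences in the interior and one-sided differences at the boundary, so each axis needs at least two voxels.

```python
import math


def gaussian_density(centers, shape, spacing, sigma):
    """Sum of isotropic Gaussians on a regular grid, indexed as grid[i][j][k]."""
    nx, ny, nz = shape
    two_sigma_sq = 2.0 * sigma * sigma
    grid = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                x, y, z = i * spacing, j * spacing, k * spacing
                grid[i][j][k] = sum(
                    math.exp(-((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2) / two_sigma_sq)
                    for cx, cy, cz in centers
                )
    return grid


def gradient_magnitude(grid, spacing):
    """Finite-difference |grad rho| per voxel (requires at least 2 voxels per axis)."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for i in range(nx):
        i0, i1 = max(i - 1, 0), min(i + 1, nx - 1)
        for j in range(ny):
            j0, j1 = max(j - 1, 0), min(j + 1, ny - 1)
            for k in range(nz):
                k0, k1 = max(k - 1, 0), min(k + 1, nz - 1)
                gx = (grid[i1][j][k] - grid[i0][j][k]) / ((i1 - i0) * spacing)
                gy = (grid[i][j1][k] - grid[i][j0][k]) / ((j1 - j0) * spacing)
                gz = (grid[i][j][k1] - grid[i][j][k0]) / ((k1 - k0) * spacing)
                out[i][j][k] = math.sqrt(gx * gx + gy * gy + gz * gz)
    return out


# One Gaussian "atom" at the center of a 5 x 5 x 5 grid with unit spacing.
rho = gaussian_density([(2.0, 2.0, 2.0)], (5, 5, 5), spacing=1.0, sigma=1.0)
grad = gradient_magnitude(rho, spacing=1.0)
```

Either grid could serve as a single input channel to a 3D CNN; the gradient magnitude vanishes at the density maximum and is largest on the flanks of the Gaussian.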


Towards A Universally Transferable Acceleration Method for Density Functional Theory

Liu, Zhe, Ni, Yuyan, Pu, Zhichen, Sun, Qiming, Liu, Siyuan, Yan, Wen

arXiv.org Artificial Intelligence

Recently, sophisticated deep learning-based approaches have been developed for generating efficient initial guesses to accelerate the convergence of density functional theory (DFT) calculations. While the actual initial guesses are often density matrices (DM), quantities that can be converted into density matrices also qualify as alternative forms of initial guesses. Hence, existing works mostly rely on the prediction of the Hamiltonian matrix for obtaining high-quality initial guesses. However, the Hamiltonian matrix is both numerically difficult to predict and intrinsically non-transferable, hindering the application of such models in real scenarios. In light of this, we propose a method that constructs DFT initial guesses by predicting the electron density in a compact auxiliary basis representation using E(3)-equivariant neural networks. Trained on small molecules with up to 20 atoms, our model is able to achieve an average 33.3% self-consistent field (SCF) step reduction on systems up to 60 atoms, substantially outperforming Hamiltonian-centric and DM-centric models. Critically, this acceleration remains nearly constant with increasing system sizes and exhibits strong transferability across orbital basis sets and exchange-correlation (XC) functionals. To the best of our knowledge, this work represents the first robust candidate for a universally transferable DFT acceleration method. We are also releasing the SCFbench dataset and its accompanying code to facilitate future research in this promising direction.


EDBench: Large-Scale Electron Density Data for Molecular Modeling

Xiang, Hongxin, Li, Ke, Liu, Mingquan, Cheng, Zhixiang, Yao, Bin, Du, Wenjie, Xia, Jun, Zeng, Li, Jin, Xin, Zeng, Xiangxiang

arXiv.org Artificial Intelligence

Existing molecular machine learning force fields (MLFFs) generally focus on the learning of atoms, molecules, and simple quantum chemical properties (such as energy and force), but ignore the importance of electron density (ED) $\rho(r)$ in accurately understanding molecular force fields (MFFs). ED describes the probability of finding electrons at specific locations around atoms or molecules, which uniquely determines all ground state properties (such as energy, molecular structure, etc.) of interacting multi-particle systems according to the Hohenberg-Kohn theorem. However, the calculation of ED relies on time-consuming first-principles density functional theory (DFT), which leads to a lack of large-scale ED data and limits its application in MLFFs. In this paper, we introduce EDBench, a large-scale, high-quality dataset of ED designed to advance learning-based research at the electronic scale. Built upon PCQM4Mv2, EDBench provides accurate ED data, covering 3.3 million molecules. To comprehensively evaluate the ability of models to understand and utilize electronic information, we design a suite of ED-centric benchmark tasks spanning prediction, retrieval, and generation. Our evaluation on several state-of-the-art methods demonstrates that learning from EDBench is not only feasible but also achieves high accuracy. Moreover, we show that learning-based methods can efficiently calculate ED with comparable precision while significantly reducing the computational cost relative to traditional DFT calculations. All data and benchmarks from EDBench will be freely available, laying a robust foundation for ED-driven drug discovery and materials science.


BondMatcher: H-Bond Stability Analysis in Molecular Systems

Daniel, Thomas, Olejniczak, Malgorzata, Tierny, Julien

arXiv.org Artificial Intelligence

This application paper investigates the stability of hydrogen bonds (H-bonds), as characterized by the Quantum Theory of Atoms in Molecules (QTAIM). First, we contribute a database of 4544 electron densities associated with four isomers of water hexamers (the so-called Ring, Book, Cage and Prism), generated by distorting their equilibrium geometry under various structural perturbations, modeling the natural dynamic behavior of molecular systems. Second, we present a new stability measure, called bond occurrence rate, associating each bond path present at equilibrium with its rate of occurrence within the input ensemble. We also provide an algorithm, called BondMatcher, for its automatic computation, based on a tailored, geometry-aware partial isomorphism estimation between the extremum graphs of the considered electron densities. Our new stability measure allows for the automatic identification of densities lacking H-bond paths, enabling further visual inspections. Specifically, the topological analysis enabled by our framework corroborates experimental observations and provides refined geometrical criteria for characterizing the disappearance of H-bond paths. Our electron density database and our C++ implementation are available at this address: https://github.com/thom-dani/BondMatcher.
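The bond occurrence rate defined in this abstract reduces to a simple count once the matching step has been done. The sketch below assumes that each ensemble member has already been matched to the equilibrium geometry, so a bond path can be identified by an unordered pair of atom indices. That matching is the hard part handled by BondMatcher and is not reproduced here; the function name and the data layout are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

Bond = FrozenSet[int]  # unordered pair of atom indices (assumed representation)


def bond_occurrence_rate(
    equilibrium_bonds: Set[Bond],
    ensemble_bonds: List[Set[Bond]],
) -> Dict[Bond, float]:
    """Fraction of ensemble members in which each equilibrium bond path is present."""
    if not ensemble_bonds:
        raise ValueError("ensemble must contain at least one member")
    n_members = len(ensemble_bonds)
    return {
        bond: sum(bond in member for member in ensemble_bonds) / n_members
        for bond in equilibrium_bonds
    }


# Toy data: two bond paths at equilibrium, four distorted geometries.
b01, b23 = frozenset({0, 1}), frozenset({2, 3})
rates = bond_occurrence_rate(
    {b01, b23},
    [{b01, b23}, {b01}, {b01, b23}, {b01}],
)
```

In this toy ensemble the first bond path appears in every member and the second in half of them, so members lacking the second path are the ones that would be flagged for visual inspection.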


ELECTRA: A Symmetry-breaking Cartesian Network for Charge Density Prediction with Floating Orbitals

Elsborg, Jonas, Thiede, Luca, Aspuru-Guzik, Alán, Vegge, Tejs, Bhowmik, Arghya

arXiv.org Artificial Intelligence

We present the Electronic Tensor Reconstruction Algorithm (ELECTRA), an equivariant model for predicting electronic charge densities using "floating" orbitals. Floating orbitals are a long-standing idea in the quantum chemistry community that promises more compact and accurate representations by placing orbitals freely in space, as opposed to centering all orbitals at the position of atoms. Finding ideal placements of these orbitals requires extensive domain knowledge, however, which has thus far prevented widespread adoption. We solve this in a data-driven manner by training a Cartesian tensor network to predict orbital positions along with orbital coefficients. This is made possible through a symmetry-breaking mechanism that is used to learn position displacements with lower symmetry than the input molecule while preserving the rotation equivariance of the charge density itself. Inspired by recent successes of Gaussian Splatting in representing densities in space, we use Gaussians as our orbitals and predict their weights and covariance matrices. Our method achieves a state-of-the-art balance between computational efficiency and predictive accuracy on established benchmarks.


Physics-consistent machine learning: output projection onto physical manifolds

Valente, Matilde, Dias, Tiago C., Guerra, Vasco, Ventura, Rodrigo

arXiv.org Artificial Intelligence

Data-driven machine learning models often require extensive datasets, which can be costly or inaccessible, and their predictions may fail to comply with established physical laws. Current approaches for incorporating physical priors mitigate these issues by penalizing deviations from known physical laws, as in physics-informed neural networks, or by designing architectures that automatically satisfy specific invariants. However, penalization approaches do not guarantee compliance with physical constraints for unseen inputs, and invariant-based methods lack flexibility and generality. We propose a novel physics-consistent machine learning method that directly enforces compliance with physical principles by projecting model outputs onto the manifold defined by these laws. This procedure ensures that predictions inherently adhere to the chosen physical constraints, improving reliability and interpretability. Our method is demonstrated on two systems: a spring-mass system and a low-temperature reactive plasma. Compared to purely data-driven models, our approach significantly reduces errors in physical law compliance, enhances predictive accuracy of physical quantities, and outperforms alternatives when working with simpler models or limited datasets. The proposed projection-based technique is versatile and can function independently or in conjunction with existing physics-informed neural networks, offering a powerful, general, and scalable solution for developing fast and reliable surrogate models of complex physical systems, particularly in resource-constrained scenarios.
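The idea of projecting a model output onto a constraint manifold can be illustrated on the spring-mass system mentioned in this abstract. The sketch assumes that the physical law is conservation of the total energy E = k x^2 / 2 + m v^2 / 2 and that the projection is a radial rescaling of the predicted state. That rescaling is the orthogonal projection in energy-scaled coordinates, where the constraint set is a circle; it is a simplification chosen for illustration and not necessarily the projection used by the authors. All function names are hypothetical.

```python
import math


def energy(x: float, v: float, k: float, m: float) -> float:
    """Total energy of a spring-mass system with stiffness k and mass m."""
    return 0.5 * k * x * x + 0.5 * m * v * v


def project_to_energy(x: float, v: float, k: float, m: float, e_target: float):
    """Rescale a predicted state (x, v) so that it lies on the energy level set e_target."""
    e_pred = energy(x, v, k, m)
    if e_pred == 0.0:
        raise ValueError("cannot rescale the zero state onto a nonzero energy level")
    scale = math.sqrt(e_target / e_pred)
    return x * scale, v * scale


# A hypothetical surrogate prediction that slightly violates energy conservation.
k, m, e_true = 4.0, 1.0, 2.0
x_pred, v_pred = 0.72, 1.45
x_proj, v_proj = project_to_energy(x_pred, v_pred, k, m, e_true)
```

The projected state satisfies the constraint up to floating-point error regardless of the input, which is the property that penalty-based training alone does not guarantee for unseen inputs.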


Stable and Accurate Orbital-Free DFT Powered by Machine Learning

Remme, Roman, Kaczun, Tobias, Ebert, Tim, Gehrig, Christof A., Geng, Dominik, Gerhartz, Gerrit, Ickler, Marc K., Klockow, Manuel V., Lippmann, Peter, Schmidt, Johannes S., Wagner, Simon, Dreuw, Andreas, Hamprecht, Fred A.

arXiv.org Artificial Intelligence

Hohenberg and Kohn have proven that the electronic energy and the one-particle electron density can, in principle, be obtained by minimizing an energy functional with respect to the density. Given that decades of theoretical work have so far failed to produce this elusive exact energy functional promising great computational savings, it is reasonable to try and learn it empirically. Using rotationally equivariant atomistic machine learning, we obtain for the first time a density functional that, when applied to the organic molecules in QM9, yields energies with chemical accuracy while also converging to meaningful electron densities. Augmenting the training data with densities obtained from perturbed potentials proved key to these advances. Altogether, we are now closer than ever to fulfilling Hohenberg and Kohn's promise, paving the way for more efficient calculations in large molecular systems. arXiv:2503.00443v1


2D Integrated Bayesian Tomography of Plasma Electron Density Profile for HL-3 Based on Gaussian Process

Wang, Cong, Yang, Renjie, Li, Dong, Yang, Zongyu, Wang, Zhijun, Wei, Yixiong, Li, Jing

arXiv.org Artificial Intelligence

This paper introduces an integrated Bayesian model that combines line integral measurements and point values using a Gaussian Process (GP). The proposed method leverages Gaussian Process Regression (GPR) to incorporate point values into 2D profiles and employs coordinate mapping to integrate magnetic flux information for 2D inversion. The average relative error of the reconstructed profile, using the integrated Bayesian tomography model with normalized magnetic flux, is as low as $3.60 \times 10^{-4}$. Additionally, sensitivity tests were conducted on the number of grids, the standard deviation of synthetic diagnostic data, and noise levels, laying a solid foundation for the application of the model to experimental data. This work not only achieves accurate 2D inversion using the integrated Bayesian model but also provides a robust framework for decoupling pressure information from equilibrium reconstruction, thus making it possible to optimize equilibrium reconstruction using inversion results.


Machine learning-guided construction of an analytic kinetic energy functional for orbital free density functional theory

Manzhos, Sergei, Luder, Johann, Ihara, Manabu

arXiv.org Machine Learning

Machine learning (ML) of kinetic energy functionals (KEF) for orbital-free density functional theory (OF-DFT) holds the promise of addressing an important bottleneck in large-scale ab initio materials modeling where sufficiently accurate analytic KEFs are lacking. However, ML models are not as easily handled as analytic expressions; they need to be provided in the form of algorithms and associated data. Here, we bridge the two approaches and construct an analytic expression for a KEF guided by interpretative machine learning of crystal cell-averaged kinetic energy densities ($\tau$) of several hundred materials. A previously published dataset including multiple phases of 433 unary, binary, and ternary compounds containing Li, Al, Mg, Si, As, Ga, Sb, Na, Sn, P, and In was used for training, including data at the equilibrium geometry as well as strained structures. A hybrid Gaussian process regression - neural network (GPR-NN) method was used to understand the type of functional dependence of $\tau$ on the features which contained cell-averaged terms of the 4th order gradient expansion and the product of the electron density and Kohn-Sham effective potential. Based on this analysis, an analytic model is constructed that can reproduce Kohn-Sham DFT energy-volume curves with sufficient accuracy (pronounced minima that are sufficiently close to the minima of the Kohn-Sham DFT-based curves and with sufficiently close curvatures) to enable structure optimizations and elastic response calculations.